This notebook accompanies the article entitled:

Causal inference with multiple versions of treatment and application to personalized medicine

It contains the investigation on PDX data referenced as section 5 of the article.

1 Objectives and methods

We want to perform a quantitative evaluation of precision medicine (PM) strategies with observational data. To do so, we emulate target trials in the potential outcomes framework in order to estimate our causal effects of interest.

1.1 Target trials

Target trials to estimate the causal effect of a precision medicine (PM) algorithm versus different controls. Patients are first screened according to their eligibility for the algorithm: based on their genomic characteristics, patients are recommended a specific treatment (eligible) or not (not eligible). Then eligible patients are randomized and assigned either to the PM-directed arm or to one of the alternative control arms (CE_1, CE_2 or CE_3).

We define 3 different target trials comparing a precision medicine arm with 3 different control arms, therefore defining 3 different causal effects: CE1, CE2 and CE3.

Our first causal effect of interest, called CE1 in this document, quantifies the effect of PM versus a simple control. In practice, this causal effect corresponds to the expected gain compared with a single-version reference/standard of care. This reference treatment can also be understood as the absence of treatment in some cases.

Our second causal effect of interest, called CE2 in this document, quantifies the PM improvement compared to the treatments effectively given in the real cohort, i.e. the physician’s choice. We assume that treatment assignments in observational data have their own rationale and we compare our PM algorithm to this rationale. In practice, this causal effect compares the treatment assignment in the observational data with the potential reassignment of treatments that would have been performed by the PM algorithm.

Our third causal effect of interest, called CE3 in this document, quantifies the relevance of treatment assignment, i.e. the potential improvement of PM treatment assignment compared to a random assignment of the same versions of treatment.

All these effects are defined only for PM-eligible patients, i.e. for patients whose mutations result in a personalized treatment recommendation. For non-eligible patients it does not make sense to quantify the impact of the PM algorithm and to compare it with a control.

1.2 Potential outcomes framework

We will use the potential outcomes framework to estimate the causal effects described in the previous section with observational data. Below, we briefly summarize the variables we use to model our precision medicine setting. Please refer to the article for a detailed description of the potential outcomes framework, counterfactual variables and the impact of the multiplicity of versions of treatment.

Causal diagram in Precision Medicine

We have:

  • C, the patient covariates, in our case mainly some biomarkers based on mutations that can help define the best personalized treatment.
  • A, the treatment status, in our case whether the patient has been treated with an anti-cancer drug (\(A=1\)) or has received the control/been left untreated (\(A=0\))
  • K, the version of treatment, in our case the precise drug given to the patient. For \(A=1\), K will be one of the three possible personalized drugs, \(\mathcal{K}^{1} = \{k^{1,1}, k^{1,2}, k^{1,3}\}\). More simply, \(\mathcal{K}^{0}=\{k^0\}\), whose definition depends on the choice of control
  • Y, the treatment strategy outcome, like tumour size

2 PDX data

2.1 Background

We base our analysis on a dataset of Patient-Derived Xenografts (PDX) from Gao et al. These are patients' tumours implanted in mice. Since several pieces of the same tumour can be implanted in several mice, different drugs can be tested on the same tumour (same patient of origin). In a way, we have access to some counterfactual values, which may help us evaluate our ability to recover real causal effects with dedicated statistical methods.

Each patient has been screened for different drugs (not the same subset of drugs for all patients). The response was determined by comparing the tumour volume at time \(t\) to the tumour volume at baseline \(t_0\). Several metrics are computed:

\(TumourVolumeChange(\%) = \Delta Vol_t = 100\% \times \dfrac{V_t-V_{t_0}}{V_{t_0}}\)

\(BestResponse = min(\Delta Vol_t), t>10d\)

\(AvgResponse_t = mean(\Delta Vol_i, 0 \leq i\leq t)\)

\(BestAvgResponse = min(AvgResponse_t), t>10d\)

We will mainly focus on \(BestAvgResponse\). This metric “captures a combination of speed, strength and durability of response into a single value”. Qualitatively, lower values correspond to more efficient drugs.
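As an illustration, the four metrics above can be computed from a single tumour growth curve. The following is a minimal Python sketch, not the code used in this notebook: the function name `response_metrics`, its arguments and the example measurements are hypothetical, and it assumes that the first measurement is the baseline \(t_0\) and that \(AvgResponse_t\) averages over the measured time points only.

```python
from statistics import mean


def response_metrics(days, volumes, min_day=10):
    """Compute BestResponse and BestAvgResponse for one tumour-drug curve.

    days    : measurement days in increasing order, days[0] being t0
    volumes : tumour volumes measured on those days
    min_day : only time points with t > min_day enter the minima
    """
    if not days or len(days) != len(volumes):
        raise ValueError("days and volumes must be non-empty and of equal length")
    v0 = volumes[0]
    # Tumour volume change (%) relative to the baseline volume
    delta = [100.0 * (v - v0) / v0 for v in volumes]
    # Running mean of the volume change from t0 up to each time point
    avg = [mean(delta[: i + 1]) for i in range(len(delta))]
    late = [i for i, d in enumerate(days) if d > min_day]
    if not late:
        raise ValueError("no measurement after min_day")
    return {
        "best_response": min(delta[i] for i in late),
        "best_avg_response": min(avg[i] for i in late),
    }


# Hypothetical curve: growth at day 5, then shrinkage
metrics = response_metrics([0, 5, 12, 20], [200, 220, 150, 180])
print(metrics)
```

On this toy curve the volume changes are 0, 10, -25 and -10 %, so the best response is -25 and the best average response is -6.25.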

We also use the provided \(ResponseCategory\), which is based on mRECIST criteria. We re-process the data and define a binary variable \(ResponseBin\) which is 0 when the tumour-drug combination experienced a progressive disease and 1 otherwise (complete response, partial response or stable disease).
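The binarization can be sketched as follows in Python. This is only an illustration: the function name `response_bin` is hypothetical, and the category labels `CR`, `PR`, `SD` and `PD` are assumed abbreviations that may differ from the labels actually stored in the dataset.

```python
# Assumed category labels (abbreviations of the mRECIST classes)
RESPONDER_CATEGORIES = {"CR", "PR", "SD"}  # complete/partial response, stable disease
PROGRESSIVE_CATEGORY = "PD"                # progressive disease


def response_bin(category):
    """Return 0 for progressive disease and 1 for any other known category."""
    label = category.strip().upper()
    if label == PROGRESSIVE_CATEGORY:
        return 0
    if label in RESPONDER_CATEGORIES:
        return 1
    # Unknown labels are rejected rather than silently coded as responders
    raise ValueError(f"unknown response category: {category!r}")


print([response_bin(c) for c in ["CR", "PR", "SD", "PD"]])
```

Raising an error on unknown labels is a design choice of this sketch: it avoids introducing missing or miscoded outcomes into later model fits.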

Last, but not least, we have omics profiles for many of these tumours with information on mutations/CNA and RNA. Only mutations/CNA are used in the present example.

2.2 Import data

Data is imported from the Supplementary Material of the paper. The tissue of origin of each tumour is recovered from the Xeva R package.

## [1] "Done"

2.3 Overview of data and summary plots

Before studying precision medicine trials with PDX data, we provide below a generic description of the whole PDX dataset for readers who would like to familiarize themselves with it and get a more global view.

2.3.1 Size and sparsity

The dataset is composed of 281 PDX models, and 63 drugs have been tested. Nevertheless, the data matrix is quite sparse since not all drugs have been tested for all patients.

Some drugs have been tested in a comprehensive subset of PDX models (LEE011, binimetinib…) but most of them have been tested according to cancer-specific patterns.
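The sparsity of the model-by-drug matrix can be quantified with a fill rate and a per-drug coverage. The Python sketch below is illustrative only: the function names, the dictionary layout and the toy model identifiers and values are hypothetical and do not come from the dataset.

```python
def matrix_fill_rate(responses, models, drugs):
    """Fraction of (model, drug) pairs for which a response is available."""
    tested = sum(1 for m in models for d in drugs if (m, d) in responses)
    return tested / (len(models) * len(drugs))


def coverage_per_drug(responses, models, drugs):
    """Fraction of models on which each drug has been tested."""
    return {
        d: sum(1 for m in models if (m, d) in responses) / len(models)
        for d in drugs
    }


# Toy example with made-up models and response values
toy = {
    ("X-001", "LEE011"): -5.0,
    ("X-001", "BYL719"): 12.0,
    ("X-002", "LEE011"): 30.0,
}
models = ["X-001", "X-002"]
drugs = ["LEE011", "BYL719"]
print(matrix_fill_rate(toy, models, drugs))
print(coverage_per_drug(toy, models, drugs))
```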

2.3.2 Efficacy of drugs

Let’s first observe the differences between untreated and treated patients (whatever treatment they receive).

Treated tumours have, on average, a decreased volume. Moreover, the response metrics with and without treatment are significantly correlated, supporting the hypothesis of a latent tumour aggressiveness factor.

Assuming \(BestAvgResponse\) when untreated is a good proxy for Aggressiveness, is it evenly distributed among cancer tissues?

Analysis per tissue, based on binary Responder status:

All in all, CM tumours appear a little bit more aggressive.

Now, what about difference between drugs?

Drug combinations are highly represented among the most effective drugs. All in all, there is a good “overlap” between the distributions of drug effects: we do not observe completely distinct distributions with some drugs being systematically more efficient than others for all PDX models.

Can we go even further and say that all drugs have at least a few patients for whom they prove to be the best?

Not exactly. Some drugs (and especially combinations) win the gold medal more often, but the landscape is still quite diverse and there is no panacea! This supports the idea that the best strategy is not to treat all patients with the same drugs but rather to adapt drugs to patients, which argues for a precision medicine approach.
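The "gold medal" count can be sketched in Python as follows. This is a hedged illustration rather than the notebook's code: the function name and the nested-dictionary layout are hypothetical, the toy values are made up, and the best drug is taken as the one with the lowest \(BestAvgResponse\) among the drugs actually tested on each model.

```python
from collections import Counter


def best_drug_counts(responses):
    """Count how often each drug gives the lowest response value.

    responses: {model: {drug: BestAvgResponse}}; only tested drugs are listed,
    so each model is compared on its own subset of drugs.
    """
    wins = Counter()
    for by_drug in responses.values():
        if by_drug:
            # Lower BestAvgResponse means a more effective drug;
            # ties go to the first drug encountered
            wins[min(by_drug, key=by_drug.get)] += 1
    return wins


# Toy example with made-up models and values
toy = {
    "X-001": {"LEE011": 10.0, "BYL719": -20.0},
    "X-002": {"LEE011": -5.0, "binimetinib": 15.0},
    "X-003": {"BYL719": -40.0},
}
print(best_drug_counts(toy).most_common())
```

Because the matrix is sparse, drugs tested on more models have more opportunities to win, so such counts are best read together with the coverage of each drug.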

3 Causality and PDX for precision medicine

3.1 Define treatment strategy and patient profiles

Let’s now define our precision medicine treatment strategy, PM1:

  • PM1, based on mutation biomarkers
    • For mutations in PIK3CA → BYL719, a PI3Kα inhibitor
    • For mutations in KRAS/BRAF → binimetinib, a MEK inhibitor (MAPK pathway)

PTEN is also included in the genomic covariates of the model since it has been identified as a relevant predictor. LEE011 is considered our standard non-targeted drug.
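The PM1 rules above can be written as a small lookup function. The Python sketch below is illustrative: the function name `recommend_pm1` is hypothetical, and since the text does not state how a tumour mutated in both PIK3CA and KRAS/BRAF is handled, the sketch simply returns every matching drug instead of assuming a priority rule.

```python
def recommend_pm1(mutated_genes):
    """Return the list of drugs recommended by PM1 for a set of mutated genes.

    An empty list means the tumour is not eligible for the PM algorithm.
    """
    genes = {g.upper() for g in mutated_genes}
    recommendations = []
    if "PIK3CA" in genes:
        recommendations.append("BYL719")
    if genes & {"KRAS", "BRAF"}:
        recommendations.append("binimetinib")
    return recommendations


print(recommend_pm1(["PIK3CA"]))          # one targeted drug
print(recommend_pm1(["TP53"]))            # not eligible
print(recommend_pm1(["KRAS", "PIK3CA"]))  # both rules match
```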

3.2 Data description

We now study the clinical impact of our PM algorithm on PDX models. To do this, we focus the analysis on models eligible for this PM algorithm (i.e. mutated for BRAF/KRAS or PIK3CA) and with full data availability (i.e. responses for the binimetinib, BYL719 and LEE011 drugs).
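The two selection criteria can be sketched as a simple filter. This Python snippet is a hypothetical illustration: the function name, the data layout and the toy models are assumptions, not the notebook's actual data structures.

```python
REQUIRED_DRUGS = ("binimetinib", "BYL719", "LEE011")
PM_GENES = {"PIK3CA", "KRAS", "BRAF"}


def select_cohort(mutations, responses):
    """Keep models that are PM-eligible and have responses for all required drugs.

    mutations: {model: iterable of mutated genes}
    responses: {model: {drug: response}}
    """
    return sorted(
        model
        for model, genes in mutations.items()
        if PM_GENES & set(genes)
        and all(drug in responses.get(model, {}) for drug in REQUIRED_DRUGS)
    )


# Toy example with made-up models
mutations = {"X-001": ["PIK3CA"], "X-002": ["TP53"], "X-003": ["KRAS"]}
responses = {
    "X-001": {"binimetinib": 5.0, "BYL719": -30.0, "LEE011": 20.0},
    "X-002": {"binimetinib": 8.0, "BYL719": 12.0, "LEE011": 15.0},
    "X-003": {"binimetinib": -25.0, "BYL719": 10.0},  # LEE011 not tested
}
print(select_cohort(mutations, responses))
```

In this toy example only `X-001` is kept: `X-002` is not eligible and `X-003` lacks a LEE011 response.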

This reduced cohort contains 88 patients. We plot below the distribution of tissues and biomarkers.

3.3 Check algorithm relevance

3.3.1 Influence of biomarkers

Let’s first check whether the algorithm has been designed in a meaningful way, consistent with the data.

Treatment assignment algorithm and observed drug sensitivities are consistent since mutated BRAF/KRAS tumours have a better binimetinib response and mutated PIK3CA tumours have a better BYL719 response. In addition, it can be noted that these biomarkers have deleterious cross-effects.

3.3.2 Counterfactual responses

We can also have a look at some treatment strategy variables in order to observe the aggregated picture.

3.4 Analysis with random cohorts

We sample 1000 cohorts of 70 patients (out of 88) and randomly assign an observed treatment to each patient. We then compute the different estimates for all cohorts.

## [1] "Computation of estimates done"
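The resampling scheme described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the notebook's implementation: all function names are hypothetical, the observed treatment is assumed to be drawn uniformly among binimetinib, BYL719 and LEE011, and `naive_contrast` is a deliberately simple placeholder, not one of the estimators of CE1, CE2 or CE3 described in the article.

```python
import random
from statistics import mean

DRUGS = ("binimetinib", "BYL719", "LEE011")


def simulate_cohorts(responses, estimator, n_cohorts=1000, cohort_size=70, seed=0):
    """Repeatedly sample a cohort, assign one drug at random per patient,
    reveal only the response to that drug, and apply `estimator`.

    responses: {model: {drug: response}} with all DRUGS available per model
    estimator: function taking a list of (model, drug, response) tuples
    """
    rng = random.Random(seed)
    models = sorted(responses)
    estimates = []
    for _ in range(n_cohorts):
        cohort = rng.sample(models, cohort_size)  # sampling without replacement
        observed = []
        for model in cohort:
            drug = rng.choice(DRUGS)  # assumption: uniform random assignment
            observed.append((model, drug, responses[model][drug]))
        estimates.append(estimator(observed))
    return estimates


def naive_contrast(observed):
    """Placeholder estimator: mean observed response under the targeted drugs
    minus mean observed response under LEE011 (None if one arm is empty)."""
    targeted = [y for _, drug, y in observed if drug != "LEE011"]
    control = [y for _, drug, y in observed if drug == "LEE011"]
    if not targeted or not control:
        return None
    return mean(targeted) - mean(control)


# Toy data: 88 made-up models with constant responses
toy = {f"X-{i:03d}": {"binimetinib": -10.0, "BYL719": -10.0, "LEE011": 0.0}
       for i in range(88)}
estimates = simulate_cohorts(toy, naive_contrast, n_cohorts=20)
print(len(estimates))
```

Because PDX data contain the responses of each tumour to several drugs, estimates obtained from the revealed responses only can then be compared with values computed from the full response table.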

3.5 Summary plots

3.6 Analysis with unbalanced cohorts

We sample 1000 cohorts of 70 patients (out of 88) and randomly assign an observed treatment to each patient. We then compute the different estimates for all cohorts.

4 Analysis with a discrete outcome

We can reproduce very similar analyses replacing the continuous outcome by a binary one.

This reduced cohort still contains 88 patients, with the same distribution of tissues and mutations.

4.1 Counterfactual responses

What is the global landscape of drug sensitivities in this reduced cohort?

4.2 Analysis with random cohorts

Similarly, we sample 1000 cohorts of 70 patients (out of 88) and randomly assign an observed treatment to each patient. We then compute the different estimates for all cohorts.

## Error in glm.fit(x = structure(c(1.39909676787471, 0, 1.39909676787471,  : 
##   NA/NaN/Inf in 'y'
## [1] "Computation of estimates done"

4.3 Summary plots

## [1] "Computation done in:"
## 8396.022 sec elapsed